… stats bugs
Group A of SchemaAnalyzer refactor:
- Fix A1: array element stats overwrite bug (isNewTypeEntry)
- Fix A2: probability >100% for array-embedded objects (x-documentsInspected)
- Rename folder: src/utils/json/mongo/ → src/utils/json/data-api/
- Rename enum: MongoBSONTypes → BSONTypes
- Rename file: MongoValueFormatters → ValueFormatters
- Add 9 new tests for array stats and probability
Group B of SchemaAnalyzer refactor:
- B1: SchemaAnalyzer class with addDocument(), getSchema(), reset(), getDocumentCount()
- B2: clone() method using structuredClone for schema branching
- B3: addDocuments() batch convenience method
- B4: static fromDocument()/fromDocuments() factories (replaces getSchemaFromDocument)
- B5: Migrate ClusterSession to use SchemaAnalyzer instance
- B6-B7: Remove old free functions (updateSchemaWithDocument, getSchemaFromDocument)
- Keep getPropertyNamesAtLevel, getSchemaAtPath, buildFullPaths as standalone exports
…x properties type
Group C of SchemaAnalyzer refactor:
- C1: Add typed x-minValue, x-maxValue, x-minLength, x-maxLength, x-minDate, x-maxDate, x-trueCount, x-falseCount, x-minItems, x-maxItems, x-minProperties, x-maxProperties to JSONSchema interface
- C2: Fix properties type: properties?: JSONSchema → properties?: JSONSchemaMap
- C3: Fix downstream type errors in SchemaAnalyzer.test.ts (JSONSchemaRef casts)
…temBsonType
Group D of SchemaAnalyzer refactor:
- D1: Add bsonType to FieldEntry (dominant BSON type from x-bsonType)
- D2: Add bsonTypes[] for polymorphic fields (2+ distinct types)
- D3: Add isOptional flag (x-occurrence < parent x-documentsInspected)
- D4: Add arrayItemBsonType for array fields (dominant element BSON type)
- D5: Sort results: _id first, then alphabetical by path
- D6: Verified generateMongoFindJsonSchema still works (additive changes)
- G4: Add 7 getKnownFields tests covering all new fields
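Items D3 and D5 can be illustrated with a minimal sketch. The `FieldEntrySketch` shape and both helper names are hypothetical and much smaller than the real `FieldEntry`; the plain string comparison is an assumption about what "alphabetical" means here.

```typescript
// Hypothetical, simplified shape; the real FieldEntry carries more members.
interface FieldEntrySketch {
    path: string;
    bsonType: string;
    isOptional: boolean;
}

// D3: a field is flagged when it was observed in fewer documents
// than its parent inspected (x-occurrence < parent x-documentsInspected).
function isOptionalField(occurrence: number, parentDocumentsInspected: number): boolean {
    return occurrence < parentDocumentsInspected;
}

// D5: _id first, then alphabetical by path (plain code-unit comparison, an assumption).
function sortFieldEntries(entries: FieldEntrySketch[]): FieldEntrySketch[] {
    return entries.slice().sort((a, b) => {
        if (a.path === '_id') return b.path === '_id' ? 0 : -1;
        if (b.path === '_id') return 1;
        if (a.path < b.path) return -1;
        return a.path > b.path ? 1 : 0;
    });
}
```

The sort works on a copy so the caller's array is left untouched.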
… toFieldCompletionItems)
Group E of SchemaAnalyzer refactor:
- E1: generateDescriptions(): post-processor adding human-readable description strings with type info, occurrence percentage, and min/max stats
- E2: toTypeScriptDefinition(): generates TypeScript interface strings from JSONSchema for shell addExtraLib() integration
- E3: toFieldCompletionItems(): converts FieldEntry[] to CompletionItemProvider-ready FieldCompletionData[] with insert text escaping and $ references
Also:
- Rename isOptional → isSparse in FieldEntry and FieldCompletionData (all fields are implicitly optional in MongoDB API / DocumentDB API; isSparse is a statistical observation, not a constraint)
- Fix lint errors (inline type specifiers)
- 18 new tests for transformers + updated existing tests
- Add 5 tests for clone(), reset(), fromDocument(), fromDocuments(), addDocuments()
- Mark all checklist items A-G as complete, F1-F2 as deferred
- Add Manual Test Plan section (§14) with 5 end-to-end test scenarios
- Document clone() limitation with BSON Binary types (structuredClone)
- Add monotonic version counter to SchemaAnalyzer (incremented on mutations)
- Cache getKnownFields() with version-based staleness check
- Add ClusterSession.getKnownFields() accessor (delegates to cached analyzer)
- Wire collectionViewRouter to use session.getKnownFields() instead of standalone function
- Add ext.outputChannel.trace for schema accumulation and reset events
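The version-based staleness check might look roughly like the sketch below. `VersionedAnalyzerSketch` is a hypothetical stand-in that tracks only top-level keys; the real SchemaAnalyzer and its cached getKnownFields() are richer, and `computeCount` exists only to make cache hits observable.

```typescript
// Hypothetical stand-in for the analyzer; tracks top-level field names only.
class VersionedAnalyzerSketch {
    private version = 0; // monotonic, bumped on every mutation
    private readonly paths = new Set<string>();
    private cache: { version: number; fields: string[] } | undefined;
    public computeCount = 0; // demonstration only: counts real recomputations

    addDocument(doc: Record<string, unknown>): void {
        for (const key of Object.keys(doc)) {
            this.paths.add(key);
        }
        this.version++;
    }

    reset(): void {
        this.paths.clear();
        this.version++;
    }

    getKnownFields(): string[] {
        // Serve the cached list while no mutation has happened since it was built.
        if (this.cache !== undefined && this.cache.version === this.version) {
            return this.cache.fields;
        }
        this.computeCount++;
        const fields = Array.from(this.paths).sort();
        this.cache = { version: this.version, fields };
        return fields;
    }
}
```

Comparing a single counter keeps the staleness check O(1) and avoids diffing the schema itself.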
Co-authored-by: tnaum-ms <171359267+tnaum-ms@users.noreply.github.com>
…ypeScript definitions and completion items
Move SchemaAnalyzer, JSONSchema types, BSONTypes, ValueFormatters, and getKnownFields into packages/schema-analyzer as @vscode-documentdb/schema-analyzer.
- Set up npm workspaces (packages/*) and TS project references
- Update all extension-side imports to use the new package
- Configure Jest multi-project for both extension and package tests
- Remove @vscode/l10n dependency from core (replaced with plain Error)
- Fix strict-mode type issues (localeCompare bug, index signatures)
- Update .gitignore to include root packages/ directory
- Add packages/ to prettier glob
…itions
The bsonToTypeScriptMap emits non-built-in type names (ObjectId, Binary, Timestamp, etc.) without corresponding import statements or declare stubs. This is currently harmless because the output is used for display/hover only, but it should be addressed if the TS definition is ever consumed by a real TS language service.
Addresses PR #506 review comment from copilot.
…ion names
- Prefix with _ when PascalCase result starts with a digit (e.g. '123abc' → '_123abcDocument')
- Fall back to 'CollectionDocument' when name is empty or only separators
- Filter empty segments from split result
- Add tests for edge cases
Addresses PR #506 review comment from copilot.
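The naming rules above can be sketched as follows. The function name `toInterfaceName` and the separator set (any run of non-alphanumeric characters) are assumptions; only the '_' prefix, the 'Document' suffix shown in the example, and the 'CollectionDocument' fallback come from the commit message.

```typescript
// Hypothetical helper: derive a TypeScript interface name from a collection name.
function toInterfaceName(collectionName: string): string {
    // Assumption: any run of non-alphanumeric characters is a separator.
    const segments = collectionName.split(/[^A-Za-z0-9]+/).filter((s) => s.length > 0);
    if (segments.length === 0) {
        return 'CollectionDocument'; // empty name or separators only
    }
    const pascal = segments.map((s) => s.charAt(0).toUpperCase() + s.slice(1)).join('');
    // Identifiers cannot start with a digit, so prefix with '_'.
    const safe = /^[0-9]/.test(pascal) ? `_${pascal}` : pascal;
    return `${safe}Document`;
}
```

Under these assumptions, 'user-orders' becomes 'UserOrdersDocument' and '123abc' becomes '_123abcDocument'.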
Add comment explaining why the cast to JSONSchema is safe: our SchemaAnalyzer never produces boolean schema refs. Notes that a typeof guard should be added if the function is ever reused with externally-sourced schemas. Addresses PR #506 review comment from copilot.
…lashes
- Replace SPECIAL_CHARS_PATTERN with JS_IDENTIFIER_PATTERN for proper identifier validity check (catches dashes, brackets, digits, quotes, etc.)
- Escape embedded double quotes and backslashes when quoting insertText
- Add tests for all edge cases (dashes, brackets, digits, quotes, backslashes)
- Mark future-work item #1 as resolved; item #2 (referenceText/$getField) remains open for aggregation completion provider phase
Addresses PR #506 review comment from copilot.
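A minimal sketch of the quoting rule, assuming an ASCII-only identifier pattern. The regex below and the helper name `toInsertText` are illustrative guesses, not the project's actual definitions.

```typescript
// Assumption: ASCII-only approximation of a valid JS identifier.
const JS_IDENTIFIER_PATTERN = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Hypothetical helper: return the field name as-is when it is a valid identifier,
// otherwise wrap it in double quotes and escape backslashes and embedded quotes.
function toInsertText(fieldName: string): string {
    if (JS_IDENTIFIER_PATTERN.test(fieldName)) {
        return fieldName;
    }
    // Backslashes must be escaped first so the quote escapes are not doubled.
    const escaped = fieldName.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    return `"${escaped}"`;
}
```

Checking for identifier validity, rather than for a list of special characters, also catches names such as '1st' that contain no special character but still cannot be written unquoted.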
…PI alongside MongoDB API
These debug logs were used during development and logged full editor text and completion items to the webview console on every keystroke. Removed to avoid unnecessary noise and potential data exposure.
…nd include JS globals
Documentation links like [DocumentDB Docs](https://...) were not rendered as
clickable hyperlinks in Monaco's completion detail panel. Monaco requires
{ value: string, isTrusted: true } on MarkdownStrings to enable link rendering.
Set isTrusted: true on operator documentation MarkdownStrings in
mapOperatorToCompletionItem. This is safe because the documentation content
comes entirely from documentdb-constants (operator descriptions we control),
not from user-generated content.
Previously, ClusterSession reset the SchemaAnalyzer when the user changed their query. This meant queries returning 0 results left the autocompletion field list empty. Now the SchemaAnalyzer accumulates field knowledge monotonically across queries within the same session — new fields are added, type statistics enriched. Trade-off: type statistics represent aggregated observations across all queries, not a single query snapshot. This is acceptable since the UI shows approximate type info (e.g., 'mostly String') rather than absolute percentages. Added a future work discussion in docs/plan/future-work.md about potential strategies for separating cumulative vs. per-query statistics if needed.
$not is a field-level operator (e.g., { price: { $not: { $gt: 1.99 } } }),
not a root-level logical combinator like $and/$or/$nor. It was incorrectly
included in KEY_POSITION_OPERATORS, causing it to appear at query root (where
it's invalid) and be hidden at operator position (where users need it).
Changes:
- Remove '$not' from KEY_POSITION_OPERATORS in completionKnowledge.ts
- Update JSDoc to document why $not is excluded
- Update tests: expect $not at operator position, not at key position
At value position in the project editor, show 1 (include) and 0 (exclude) instead of filter-specific completions (operators, BSON constructors, etc.). At value position in the sort editor, show 1 (ascending) and -1 (descending). These are the most common values for projection and sort fields. Projection operators like $slice and $elemMatch remain available via operator-position completions for advanced use cases.
When the user clears the editor content (removing the initial '{ }'), field
completions now insert '{ fieldName: $1 }' instead of 'fieldName: $1' to
produce valid query syntax. Operator snippets already include their own braces
and are not double-wrapped.
A 'needsWrapping' flag is computed in registerLanguage.ts by checking whether
the editor text contains a '{' character; it is true when none is present.
When true, field completions in the 'all completions' fallback path get
wrapped with outer braces.
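As a rough illustration of the wrapping rule, assuming the flag simply means "no '{' anywhere in the editor text". Both function names are hypothetical.

```typescript
// Hypothetical helper: wrapping is needed when the editor has no opening brace at all.
function computeNeedsWrapping(editorText: string): boolean {
    return editorText.indexOf('{') === -1;
}

// Hypothetical helper: build the snippet text for a field completion.
// "$1" is the snippet tab stop where the cursor lands after insertion.
function fieldInsertText(fieldName: string, needsWrapping: boolean): string {
    const body = `${fieldName}: $1`;
    return needsWrapping ? `{ ${body} }` : body;
}
```

With an empty editor, `fieldInsertText('rating', computeNeedsWrapping(''))` yields `{ rating: $1 }`; with the initial `{ }` still present it yields `rating: $1`.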
Added ':', ',', and '[' to the completion provider's triggerCharacters list. These positions are already handled by the cursor context parser (value after ':', new key after ',', array element after '[') but previously required manual Ctrl+Space invocation. Added string-literal detection (isCursorInsideString) to suppress completions when trigger characters appear inside string values. Uses a forward scan counting unescaped quotes to determine if the cursor is inside a string. When inside a string, returns empty suggestions to prevent the popup.
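The string-literal detection could be sketched as a forward scan like the one below. The signature (full text plus a character offset) and the handling of both single and double quotes are assumptions; the actual isCursorInsideString may differ.

```typescript
// Sketch: scan from the start of the text to the cursor, tracking whether an
// unescaped quote has opened a string that is still open at the cursor.
function isCursorInsideString(text: string, cursorOffset: number): boolean {
    let openQuote: string | undefined;
    for (let i = 0; i < cursorOffset && i < text.length; i++) {
        const ch = text.charAt(i);
        if (openQuote !== undefined) {
            if (ch === '\\') {
                i++; // skip the escaped character, e.g. \" inside a string
            } else if (ch === openQuote) {
                openQuote = undefined; // matching closing quote
            }
        } else if (ch === '"' || ch === "'") {
            openQuote = ch;
        }
    }
    return openQuote !== undefined;
}
```

A completion provider would return an empty suggestion list whenever this returns true, so a ':' typed inside "a:b" does not open the popup.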
Extended the HoverProvider to show type information when hovering over field
names in the query editor. When a field is recognized from the completion
store (populated by SchemaAnalyzer), the hover shows:
- Field name (bold)
- BSON type (e.g., Number, String, Date)
- Sparse indicator when the field is not present in all documents
Operators/BSON constructors take priority over field names to avoid
ambiguity. Statistics use relative language ('sparse') rather than
absolute numbers since the SchemaAnalyzer accumulates data across queries.
Also set isTrusted: true on operator hover content to make doc links
clickable (consistent with the completion documentation fix).
…ypes
Three hover provider fixes:
1. Quoted string keys: Hover now works for quoted field names like
{"address.street": 1}. Monaco's getWordAtPosition treats quotes/dots
as word boundaries, so a new extractQuotedKey helper manually extracts
the full quoted key from the line content.
2. isTrusted on field hovers: Field hover content now has isTrusted: true,
making any future links in field hovers clickable. (Operator hovers
already had this from a previous commit.)
3. Redesigned field hover format:
- Field name bold, with 'sparse: not present in all documents' in
subscript on the same line (no em-dashes)
- 'Inferred Types' bold section header
- Comma-separated type list (using displayTypes from all observed
BSON types for polymorphic fields)
Also threaded bsonTypes/displayTypes through FieldCompletionData from
FieldEntry for polymorphic field support.
Added tests verifying that operator categories appear at the correct
completion positions:
- Key position: only logical combinators ($and, $or, $nor) and meta
operators ($comment, $expr, etc.) — no field-level operators
- Value position: all field-level categories — comparison ($gt, $eq),
evaluation ($regex), element ($exists, $type), array ($all,
$elemMatch, $size), and field-level $not
- Operator position: same as value position
This confirms $all is correctly excluded from key position: it is a
field-level array operator ({ tags: { $all: [...] } }), not a root-level
query combinator.
When a field completion inserts 'rating: ', the completion popup did not reappear for the value position. Now, typing a space after ':' or ',' triggers the suggestion popup after 50ms. This provides a smooth autocomplete flow: select field → space → see value suggestions. Implemented via onDidChangeModelContent listener that detects single-space insertions preceded by ':' or ',' and programmatically calls editor.action.triggerSuggest. Wired into all three editors (filter, project, sort) with proper cleanup.
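The detection step can be isolated as a pure predicate, sketched below. The function name and its two-string signature are hypothetical; the Monaco wiring is described only in comments and is not reproduced from the actual change.

```typescript
// Hypothetical predicate: should a content change re-open the suggestion popup?
// insertedText: the text of a single content change.
// textBeforeInsertion: the line content up to where the change was inserted.
function shouldTriggerSuggest(insertedText: string, textBeforeInsertion: string): boolean {
    if (insertedText !== ' ') {
        return false; // only single-space insertions qualify
    }
    const previousChar = textBeforeInsertion.charAt(textBeforeInsertion.length - 1);
    return previousChar === ':' || previousChar === ',';
}

// Assumed wiring, in outline: inside an editor.onDidChangeModelContent listener,
// evaluate the predicate for the change and, when it returns true, schedule
// editor.trigger('keyboard', 'editor.action.triggerSuggest', {}) after about 50 ms.
// The listener's disposable should be disposed when the editor unmounts.
```

Keeping the predicate free of editor types makes it straightforward to unit test without a Monaco instance.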
Monaco renders hover markdown links as <a> tags, but the webview CSP blocks direct navigation to external URLs. Added a delegated click handler on the query editor container that intercepts <a> clicks with http/https hrefs and routes them through the existing trpcClient.common.openUrl mutation, which calls vscode.env.openExternal on the extension host side.
EMPTY editor (no braces) now shows key-position completions (fields +
root operators) with { } wrapping, instead of showing all operators.
UNKNOWN context remains as the full discovery fallback.
Changes:
- createCompletionItems: route needsWrapping+unknown to new
createEmptyEditorCompletions (key-position items with wrapping)
- createAllCompletions: now pure UNKNOWN fallback (no needsWrapping param)
- New tdd/ folder with behavior spec (readme.completionBehavior.md) and
26 category-based TDD tests verifying the completion matrix
- Updated existing category tests: KEY position allows 'evaluation'
(because $expr/$text are key-position operators), UNKNOWN now shows
everything
- Updated completions/README.md: added Empty position, fixed flow docs
The TDD tests check categories (from description label) and sortText
prefixes, not specific operator names, for resilience to
documentdb-constants changes.
…pletions
Added `standalone?: boolean` to `OperatorEntry`. When `false`, the operator is excluded from completion lists but remains in the registry for hover docs.
Operators marked as standalone: false:
- Geospatial sub-operators: $box, $center, $centerSphere, $geometry, $maxDistance, $minDistance, $polygon (only valid inside $geoWithin/$near)
- Positional projection: $ (not a standalone filter/sort operator)
- Sort modifier: $natural (not valid as a filter value)
Changes:
- packages/documentdb-constants/src/types.ts: added `standalone` field
- packages/documentdb-constants/scripts/generate-from-reference.ts: parse `- **Standalone:** false` from overrides, emit in generated code. Also fixed bitwise BSON type from 'int' to 'int32' to match SchemaAnalyzer.
- packages/documentdb-constants/resources/overrides/operator-overrides.md: added standalone: false overrides for 9 operators
- src/webviews/documentdbQuery/completions/createCompletionItems.ts: filter `e.standalone !== false` in all three completion builders (all/value/operator)
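The filter semantics are worth spelling out: because the field is optional, only an explicit `false` excludes an operator. A minimal sketch, with `OperatorEntrySketch` as a hypothetical reduction of `OperatorEntry`:

```typescript
// Hypothetical, reduced shape; the real OperatorEntry has more members.
interface OperatorEntrySketch {
    value: string;
    standalone?: boolean;
}

// Keep entries whose standalone flag is true or absent; drop explicit false.
function completionCandidates(entries: OperatorEntrySketch[]): OperatorEntrySketch[] {
    return entries.filter((e) => e.standalone !== false);
}
```

Testing `!== false` rather than `=== true` means existing entries without the field keep appearing in completion lists, so only the overridden operators change behavior.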
…ab stops
ESC key handling (MonacoEditor.tsx):
- Added context precondition '!suggestWidgetVisible && !inSnippetMode' to the Escape command so Monaco's built-in handlers dismiss the suggest widget or exit snippet mode before our handler fires.
Tab key handling (MonacoAutoHeight.tsx):
- Replaced onKeyDown Tab interception with addAction using precondition '!inSnippetMode'. During snippet tab-stop navigation, Monaco's built-in Tab handler takes over. After the snippet session ends (final tab stop or ESC), Tab reverts to moving focus out.
Field names originate from user database schema and should not be rendered as trusted markdown. This change:
- Removes isTrusted from field hovers (keeps supportHtml for formatting)
- Escapes markdown metacharacters in field names and type strings
- Adds escapeMarkdown utility in src/webviews/utils/ for reuse
- Updates tests accordingly
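An escaping utility of this kind might look like the sketch below. The exact set of metacharacters is an assumption; the real escapeMarkdown in src/webviews/utils/ may cover a different set.

```typescript
// Sketch: backslash-escape characters that have meaning in Markdown so that
// user-controlled field names render literally in hover content.
// Assumption: this character set is illustrative, not the project's actual list.
function escapeMarkdown(text: string): string {
    return text.replace(/[\\`*_{}[\]()#+\-.!|<>~]/g, '\\$&');
}
```

For example, a field named `[x](u)` is rendered as literal text instead of being interpreted as a link.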
Moves extractQuotedKey and tryMatchAsClosingQuote from registerLanguage.ts into a dedicated extractQuotedKey.ts module. This decouples the pure string helper from the Monaco registration wiring, making tests less brittle and enabling easier reuse.
…mentation link handling
Shell Integration — DocumentDB Query Language & Autocomplete
Umbrella PR for the shell integration feature: a custom documentdb-query Monaco language with intelligent autocomplete, hover docs, and validation across all query editor surfaces (filter, project, sort, aggregation, shell).
Work is organized as incremental steps, each delivered via a dedicated sub-PR merged into feature/shell-integration.
Progress
- SchemaAnalyzer (JSON Schema output, incremental merge, 24 BSON types)
- @vscode-documentdb/schema-analyzer package, enriched FieldEntry with BSON types, added schema transformers, introduced monorepo structure
- documentdb-constants Package · feat: add documentdb-constants package - operator metadata for autocomplete #513: 308 operator entries (DocumentDB API query operators, update operators, stages, accumulators, BSON constructors, system variables) as static metadata for autocomplete
- documentdb-query custom language with JS Monarch tokenizer (no TS worker), validated via POC across 8 test criteria
- CompletionItemProvider · feat: documentdb-query language - CompletionItemProvider, HoverProvider, acorn validation #518: documentdb-query language registration, per-editor model URIs, completion data store, CompletionItemProvider (filter/project/sort), HoverProvider, acorn validation, $-prefix fix, query parser replacement (shell-bson-parser), type-aware operator sorting, legacy JSON Schema pipeline removal
- completions/ folder
- CompletionItemProvider
- CompletionItemProvider
Key Architecture Decisions
- documentdb-query custom language: JS Monarch tokenizer, no TS worker (~400-600 KB saved)
- CompletionItemProvider + URI routing (documentdb://{editorType}/{sessionId})
- documentdb-constants bundled at build time; field data pushed via tRPC subscription
- acorn.parseExpressionAt() for syntax errors; acorn-walk + documentdb-constants for identifier validation
- language="json" with JSON Schema validation
- language="javascript" with full TS service + .d.ts via addExtraLib()
Sub-PRs